Introduction
1. a) What is the difference between data and information?
Data are raw facts. Information is meaning extracted from data.
b) How can data be protected while it is being transmitted?
It can be encrypted (e.g., using a cryptographic system).
c) How can data be protected while it is being processed?
Data can be protected by making sure applications are securely coded and hosts are hardened.
d) What are some ways that data can be attacked when it is stored?
It can be accessed by unauthorized persons, destroyed, copied without permission, and taken outside the organization (i.e., data loss).
e) How can data be protected while it is being stored?
It can be properly backed-up, encrypted, and when necessary, destroyed. Restrictions on access to the data can also be implemented while it is being stored.
Data Protection: Backup
The Importance of Backup
2. a) List the ways in which data can be lost, adding some of your own if you can.
Data can be lost by mechanical failure, environmental casualties, malware, lost or stolen devices, and human error.
b) How does backup ensure availability?
Backup will ensure availability because you will be able to still access your files from backup copies, even if your primary hard disk fails.
c) Have you ever had to use a backup to restore a file? Explain.
Student responses will vary.
Scope of Backup
3. a) Distinguish between file/directory data backup and image backup.
File/directory data backup copies data (not programs, registry settings, configurations). Image backup copies data and all those other things.
b) Why is file/directory backup attractive compared with image backup?
File/directory backup is more attractive compared to image backup because it takes up less storage space and is much faster.
c) Why is image backup attractive compared with file/directory data backup?
Image backup is attractive because it requires minimal additional work to restore a functioning, fully capable PC.
d) What is shadowing?
Shadowing frequently records a backup copy of each file that is actively being worked on. If there is a failure, little will be lost.
e) What is the advantage of shadowing over file/directory data backup?
The advantage of shadowing is that it allows for more current file changes to be restored.
f) How is shadowing limited?
Shadowing is limited because when the capacity of the storage is exceeded, the oldest files are deleted first.
Full versus Incremental Backups
4. a) Why don’t most companies do full backup every night?
Full backups take a long time and thus companies usually only conduct full backups weekly.
b) What is incremental backup (be precise)?
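Incremental backups only back up data that has changed since the most recent backup of any kind, whether full or incremental. This is why every incremental backup made since the last full backup must be restored, in the order created.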
c) A company does a full backup one night. Call this backup Cardiff. On three successive nights, it does incremental backups, which it labels Greenwich, Dublin, and Paris. In restoration, what backups must be restored first and second?
Cardiff, then Greenwich. (Dublin and Paris come next.)
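The restore order can be sketched in a few lines of Python. This is a minimal illustration, assuming each incremental backup holds only the files changed since the previous night's backup; the file names and contents are hypothetical.

```python
# Hypothetical backup sets: each maps a file name to its contents.
cardiff = {"a.txt": "v1", "b.txt": "v1", "c.txt": "v1"}  # full backup
greenwich = {"a.txt": "v2"}  # changes on night 1
dublin = {"b.txt": "v2"}     # changes on night 2
paris = {"a.txt": "v3"}      # changes on night 3


def restore(full, incrementals):
    """Apply the full backup first, then each incremental in the order created."""
    state = dict(full)
    for backup in incrementals:
        state.update(backup)  # a later backup overwrites earlier versions
    return state


restored = restore(cardiff, [greenwich, dublin, paris])
print(restored)
```

Restoring the incrementals in any other order can leave an older version of a file on top of a newer one, which is why Cardiff must come first and Greenwich second.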
Backup Technologies
5. a) What are the advantages of centralized backup compared with local backup?
Centralized backup alleviates the key problems associated with local backup, which are:
• limited ability to enforce backup policy
• limited ability to audit which computers were backed up per policy, how backups were done, or how data was protected
b) What is CDP (do not just spell out the acronym)?
CDP is continuous data protection, in which two sites back each other up continuously, in real time.
c) Why is CDP attractive?
CDP is attractive because other sites can take over very quickly in case of a disaster, with little data loss.
d) Why is it expensive?
CDP is expensive because ample bandwidth is needed between CDP sites to allow the real-time backup of data.
e) Why is backup over the Internet to a backup storage provider attractive for client PC users?
It is attractive mainly because it is very convenient.
f) What security risk does it create?
There is the concern that the company owning the PC loses control over its data, which is a very large security risk.
g) What is mesh backup?
Mesh backup is peer-to-peer backup onto other client computers. It sends backup data in parcels to many other client PCs.
h) What are its technical challenges?
First, the mesh backup operation must not slow down the computer on which parcels are being written, or from which parcels are being retrieved. Second, specific client PCs are not always available for parcel retrieval, so parcels need to be sent out redundantly. The most difficult technical problem is security. When a client PC receives a backup parcel, its user must not be able to read, modify, or delete it.
i) Why is mesh backup desirable?
Mesh backup is desirable because it could make client PC backup automatic and, thus, eliminate the human factor in failing to conduct regular backups. It also utilizes corporate PC power that is often underused, compared to expensive separate backup hardware.
Backup Media and RAID
6. a) Why is magnetic tape desirable as a backup medium?
Magnetic tape can store vast amounts of data at the lowest cost per bit of any backup medium.
b) Why is tape not desirable?
Tape is not desirable because it is painfully slow and because there are many different tape formats and readers (unlike optical media, tape has little standardization).
c) Why is backup onto another hard drive attractive?
This method is attractive because it is a very fast method of backup.
d) Why is it not a complete backup solution?
This is not a complete solution because the backup drive could also be lost if the computer is stolen or damaged in a fire. This method is also too expensive for long-term storage.
e) How can this limitation be addressed?
Many companies use a hybrid backup method, using additional hard drives for storage for as long as possible then transferring to tape at a pre-determined time or data size.
f) How much data can be stored on a dual-layer DVD?
About 8.5 GB.
g) What is the advantage of burning backup data onto optical disks?
The advantage is that almost all users have optical disk burners.
h) Is storing backups on optical disks for several years likely to be safe?
Probably not, because the life of optical disks is still unknown and probably is short.
Disk Arrays—RAID
7. a) How can disk arrays ensure data reliability and availability?
A system using an array of drives increases reliability because redundant data are stored on multiple disks. Failure of a single drive in the array would not precipitate data loss. An array of drives can also increase read-write performance. Disk performance is increased because data can be written to, or read from, multiple disks simultaneously.
b) Explain RAID 0.
A RAID 0 configuration increases data transfer speeds and capacity by writing simultaneously to multiple hard disks. Writing data across multiple disks is known as striping. The striped set of disks is fast, but offers no reliability. If one of the drives fails, data on all disks are lost.
c) Explain RAID 1.
In a RAID 1 configuration, the client operating system writes data to both the primary hard drive and the backup hard drive at the same time. No striping is used, so data transfer speeds remain approximately the same. Storage capacity also remains the same because the additional drive is just a mirror of the primary drive.
d) Explain RAID 5.
A RAID 5 configuration stripes data across multiple disks to increase data transfer speeds. Reliability is provided by parity bits that enable reconstruction of data stored on other disks. A RAID 5 configuration can recover from a single drive failure, but not a multi-drive failure.
Computing Parity
8. a) What is parity?
Parity bits enable reconstruction of data stored on other disks in case of failure. Parity bits are stored on disks such that they can be used to reconstruct the original parts of any lost disk in the array.
b) How does the XOR operator work?
The XOR (exclusive OR) parity bit will be a 1 if exactly one of the two bits is a 1. The parity bit will be a 0 if the two bits are the same, that is, if both are 1s or both are 0s.
c) How can parity be used to restore lost data?
Suppose Disk 3 (of 3) experienced complete failure. Data from Disk 1 (Part 1, Part 3, and Parity 5&6), together with data from Disk 2 (Part 2, Part 5, and Parity 3&4), could be used to recalculate the lost data on Disk 3 (Part 4, Part 6, and Parity 1&2). No data would be lost. After all calculations are done, the data on the new Disk 3 will be identical to the data before the failure.
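The XOR rule and the recovery step can be sketched in Python. This is a minimal illustration with two hypothetical three-byte data parts, not a real RAID implementation; the helper name xor_bytes is an assumption made for the example.

```python
# XOR truth table: the result is 1 only when exactly one input bit is 1.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, a ^ b)


def xor_bytes(left: bytes, right: bytes) -> bytes:
    """XOR two equal-length byte strings, one byte at a time."""
    return bytes(x ^ y for x, y in zip(left, right))


# Hypothetical data parts stored on two different disks.
part1 = b"\x0f\xa5\x33"
part2 = b"\xf0\x5a\x3c"

# The parity stored on a third disk is the XOR of the two parts.
parity = xor_bytes(part1, part2)

# If the disk holding part2 fails, XOR the survivors to rebuild it.
recovered = xor_bytes(part1, parity)
print(recovered == part2)
```

The same calculation works for whichever single part is lost, because XOR-ing the parity with the surviving part always yields the missing part.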
d) How long would it take to recalculate the data on a lost disk?
It depends on the number of disks, the size of the disks, read/write speeds, and so on. Rebuild times vary widely, anywhere from several hours to several days.
9. a) What are the advantages of RAID 5 over RAID 1?
RAID 5 loses only a small amount of storage capacity to parity bits, far less than is lost when the entire array is mirrored (RAID 1). RAID 5 also stripes data across disks, which increases data transfer speeds. The trade-off is that the recovery/rebuild time on RAID 5 is much longer than on RAID 1, because no recalculation of the lost data is necessary in RAID 1.
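The capacity difference can be made concrete with a little arithmetic. The sketch below assumes equal-sized disks, RAID 5 parity that consumes the equivalent of one disk, and mirroring done in pairs; the function names and the 4 x 2 TB array are hypothetical.

```python
def raid5_usable(disks: int, size_tb: float) -> float:
    """Usable capacity when parity takes the equivalent of one disk."""
    if disks < 3:
        raise ValueError("RAID 5 needs at least three disks")
    return (disks - 1) * size_tb


def mirrored_usable(disks: int, size_tb: float) -> float:
    """Usable capacity when every disk is paired with a mirror."""
    if disks < 2 or disks % 2:
        raise ValueError("mirroring needs an even number of disks")
    return (disks // 2) * size_tb


raid5_tb = raid5_usable(4, 2.0)        # four 2 TB disks with parity
mirrored_tb = mirrored_usable(4, 2.0)  # the same disks, fully mirrored
print(raid5_tb, mirrored_tb)
```

With these assumptions, the parity array keeps three disks' worth of space while the mirrored array keeps only two.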
b) Which RAID level discussed in this chapter has the fastest read-write speeds?
RAID 0
c) Is RAID 5 appropriate for home users? Why, or why not?
The typical consumer is fairly price conscious and not overly worried about data loss. They prefer not to think about their brand new computer failing, and they really don’t want to pay for two additional hard disks. RAID 5 would be great for home users, but it is likely too expensive and difficult to configure. The tech-savvy end user could have a RAID 5 configuration at home, but most end users won’t.
Data Storage Policies
10. a) What should backup creation policies specify?
Backup creation policies should specify what data should be backed up, frequency of backups, restoration test periodicity, and other related guidance.
Policies should address different types of data and computers to ensure the right backup is provided for the resource.
b) Why are restoration tests needed?
Restoration tests are needed because if the data is important enough to spend precious time and resources to backup, it needs to be available when called upon. Not testing will almost guarantee some failure that could have been addressed with simple testing.
c) Where should backup media be stored for the long term?
Backup media should be stored at another site.
d) What should be done about backup media until they are moved?
Backup media should be stored in a fireproof and waterproof safe until they are moved.
e) Why is the encryption of backup media critical?
Encryption of backup media is critical because backup data can be lost or stolen; encrypting the data protects the company from expensive losses of PII or trade secrets.
f) What three dangers require control over access to backup material?
Loss, theft, and damage of backup material are the three dangers that require control over access to it.
g) If Person A wishes to check out backup media, who should approve this?
The manager of the person requesting media should approve the checkout.
h) Why are checkouts of backup media suspicious?
Checkouts of backup media are rare, so every checkout should be examined carefully. There must be a valid operational reason for retrieving the backup media.
i) Why should business units and the legal department be involved in creating retention policies?
There are many business and legal requirements on the retention of certain types of data; these departments should be involved in creating retention policies.
j) What should backup audits include?
Backups, like all other processes, require audits to make sure the established policies are being followed. Backup audits should examine backups for compliance with policy, including tracing what happened to samples of data that should have been backed up.
E-Mail Retention
11. a) Why is retaining e-mail for a long period of time useful?
Retaining e-mail is useful in that it provides a history to be searched.
b) Why is it dangerous?
It is dangerous because if e-mail can be searched, information in it could be used against the firm.
c) What is legal discovery?
Legal discovery is the process where a firm must provide records related to a lawsuit, including e-mails.
d) What are courts likely to do if it would be very expensive for a firm to discover all of its e-mail pertinent to a case?
Courts do not care; the firm will have to pay to get the e-mails recovered.
e) What can happen if a firm fails to retain required e-mail?
A firm can be fined or lose a lawsuit if it fails to retain required e-mail.
f) What is accidental retention?
Accidental retention is when e-mail or other files are located on backup tapes when they were thought to be deleted.
g) How long can third-party e-mail providers keep your e-mails?
Indefinitely.
h) Is there a specific law that specifies what information must be retained for legal purposes?
No, there are multiple laws that specify what information must be retained.
i) What two requirements in the U.S. Rules of Civil Procedure are likely to cause problems for firms who do not have a good archiving process?
In the initial discovery meeting, which occurs shortly after a lawsuit begins, the defendant must be able to describe what information it has and how it will provide it. This requires a good in-place archiving system.
The firm must be able to put a hold on all destruction of potentially relevant information if the possibility of a lawsuit is foreseen. It needs a good in-place archiving system to be able to do this.
j) Why is message authentication important in an archiving system?
Without this, there is no way to prove who sent a particular e-mail.
k) Comment on a corporate policy of deleting all e-mail after 30 days.
Deleting all e-mail after 30 days will almost inevitably destroy e-mail that the firm is required to retain, exposing it to fines and to the possibility of losing lawsuits.
User Training
12. a) Are e-mail messages sent by employees private?
No. E-mail sent over corporate resources should not be considered private.
b) What should employees be trained not to put in e-mail messages?
Employees should be trained not to say anything in e-mail that they would not want to see in court.
Spreadsheets
13. a) Why is spreadsheet security an IT security concern?
Spreadsheets are a major focus of new compliance regulations resulting from the Sarbanes–Oxley Act of 2002. Spreadsheets are highly concentrated with PII and are used for many proprietary financial calculations that would be disastrous to have hacked or deleted. Also, manipulation of spreadsheets is a key technique of criminals attempting fraud.
b) What two protections should be applied to spreadsheets?
The two protections are extensive testing for errors and fraud indicators and the use of spreadsheet vault servers.
c) Briefly list the functions of a vault server.
Spreadsheet vault servers offer the following:
• provide strong access control, authentication, authorizations, and auditing
• limit access to what a particular user can do and see in a spreadsheet
• extensive auditing
• version control of files
• cryptographic protection between PC and server
• active detection tools looking for policy violations
• strong management tools to state and update policies
d) Comment on vault server authorizations.
Vault server authorizations go beyond what a person can do with a file. They also limit what a user can see within a spreadsheet (for example, the user can see the values in the cells but not the formulas that created them).
e) Describe vault server auditing.
Vault server auditing includes check in/check out of files and individual cell changes. Individual cell change logs can be used for forensic analysis when required.
Database Security
14. a) What is a relational database? Explain.
Relational databases store their data in relations, commonly referred to as tables. Entities are types of objects that represent persons, places, things, or events. Entities are things (nouns) you want to store information about. Attributes—columns in the table—are characteristics (adjectives) about the entity that you want to collect.
b) Why would a database administrator want to restrict access to certain tables?
For example, an accountant at a hospital would not need to see tables containing patient medical records. Even though the accountant may be an honest and hard-working employee, his job does not require access to tables containing patient medical records. Conversely, a medical doctor would not need access to tables containing financial data.
c) Why would a database administrator want to restrict access to certain columns?
For example, only a few employees might be allowed to retrieve information in the salary column. Other employees could read other attributes in the Employee table, but not salaries.
d) Why would a database administrator want to restrict access to certain rows?
For example, access to employee data can be restricted to rows for each department. Each department could view their own employee records, but not the records for other departments.
e) How would limiting data granularity protect the underlying database?
For instance, in analyzing data for personnel, privacy concerns may restrict searches to being no more detailed than sums and averages at the department level.
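Table, column, row, and granularity restrictions can be sketched with database views. The example below uses Python's built-in sqlite3 module with a hypothetical employee table. SQLite has no user permissions, so the views only illustrate what a restricted user would be allowed to query; in a multi-user DBMS, the assumption is that access to the base table would be denied and access to the views granted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (
    id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL
);
INSERT INTO employee (name, dept, salary) VALUES
    ('Ann', 'Sales', 50000), ('Bob', 'Sales', 70000), ('Cy', 'IT', 80000);

-- Column restriction: every attribute except salary.
CREATE VIEW employee_public AS
    SELECT id, name, dept FROM employee;

-- Row restriction: only the rows for one department.
CREATE VIEW sales_employees AS
    SELECT id, name, dept FROM employee WHERE dept = 'Sales';

-- Limited granularity: department-level counts and averages only.
CREATE VIEW dept_salary_summary AS
    SELECT dept, COUNT(*) AS headcount, AVG(salary) AS avg_salary
    FROM employee GROUP BY dept;
""")

cursor = conn.execute("SELECT * FROM employee_public")
public_columns = [column[0] for column in cursor.description]
sales_rows = conn.execute("SELECT name FROM sales_employees ORDER BY name").fetchall()
summary = conn.execute(
    "SELECT dept, headcount, avg_salary FROM dept_salary_summary ORDER BY dept"
).fetchall()
print(public_columns, sales_rows, summary)
```

Note that a department with a single employee still exposes that person's salary through the average, so summary views are often combined with a minimum group size.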
f) What is a data model?
A data model consists of entity names, attributes, and the structure of relationships between entities.
Database Access Control
15. a) What is a DBMS?
A database management system (DBMS), such as Microsoft SQL Server, MySQL, IBM DB2, or Oracle, manages database structures and restricts access to individual databases.
b) Can a DBMS manage multiple databases? Why?
Yes. A DBMS (such as Microsoft Access) is the software that manages access to data, so a single DBMS can manage many different databases.
c) How can validation protect against a SQL injection attack?
Incoming data can be validated by making sure they are in the expected data type (e.g., text, integer, or binary), size (e.g., 32 bits, 10 characters, or less than 5 KB), or format (e.g., DD/MM/YY or (555)555-5555).
d) How can sanitation protect against a SQL injection attack?
Incoming data can be sanitized to remove unacceptable characters that could be used to manipulate the SQL statement.
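Validation and sanitation can be sketched as small checks applied before any input reaches a SQL statement. The function names, the phone format, and the allow-list of characters below are assumptions chosen for illustration, not a complete defense.

```python
import re

# Expected format from the validation answer: (555)555-5555.
PHONE_FORMAT = re.compile(r"\(\d{3}\)\d{3}-\d{4}")


def validate_phone(value: str) -> bool:
    """Accept the input only if the whole string matches the expected format."""
    return PHONE_FORMAT.fullmatch(value) is not None


def validate_quantity(value: str) -> bool:
    """Accept only an unsigned integer of at most 10 digits."""
    return re.fullmatch(r"[0-9]{1,10}", value) is not None


def sanitize_name(value: str) -> str:
    """Remove every character outside a conservative allow-list."""
    return re.sub(r"[^A-Za-z0-9 .-]", "", value)


print(validate_phone("(555)555-5555"))
print(validate_quantity("1 OR 1=1"))
print(sanitize_name("Robert'); DROP TABLE students;--"))
```

An allow-list this strict also strips legitimate characters, such as the apostrophe in a surname, so the acceptable character set has to be chosen for each field.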
Database Auditing
16. a) What types of database events should be audited?
Logins, changes, warnings, exceptions, and special access should be audited.
b) How could SQL triggers be used to secure a database?
Triggers can be used to implement audit policies and to detect noncompliance with security policies, for example by automatically recording changes to sensitive data.
c) What is a DDL trigger?
Data Definition Language (DDL) triggers can be used to produce automatic responses if the structure of the database has been altered.
d) What is a DML trigger?
Data Manipulation Language (DML) triggers can be used to produce automatic responses if data have been altered.
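A DML trigger used for auditing can be sketched with Python's built-in sqlite3 module. The table, column, and trigger names are hypothetical. SQLite supports DML triggers only, so a DDL trigger would need a DBMS that offers them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary REAL);

CREATE TABLE salary_audit (
    employee_id INTEGER,
    old_salary REAL,
    new_salary REAL,
    changed_at TEXT DEFAULT CURRENT_TIMESTAMP
);

-- DML trigger: runs automatically whenever a salary value is updated.
CREATE TRIGGER log_salary_change
AFTER UPDATE OF salary ON employee
BEGIN
    INSERT INTO salary_audit (employee_id, old_salary, new_salary)
    VALUES (OLD.id, OLD.salary, NEW.salary);
END;

INSERT INTO employee (name, salary) VALUES ('Ann', 50000);
UPDATE employee SET salary = 65000 WHERE name = 'Ann';
""")

audit_rows = conn.execute(
    "SELECT employee_id, old_salary, new_salary FROM salary_audit"
).fetchall()
print(audit_rows)
```

The audit row is written by the database itself, so the change is logged even if the application that made the update does no logging of its own.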
e) What type of sensitive data exists at your organization?
Answers will vary.
Database Placement and Configuration
17. a) What is a multi-tiered architecture? Why is it important?
A multi-tiered architecture separates the presentation (webserver), application processing (middleware server), and database management (database server) functions. It protects other layers by separating functions. If one layer is compromised, it won’t necessarily compromise the other layers.
b) How could a multi-tiered architecture stop or mitigate the effects of an attack?
A tiered architecture provides a greater level of protection to the database because vulnerabilities or attacks on one layer won’t necessarily affect other layers. For example, a DoS attack against the webserver won’t overwhelm the database server and shut down the database.
c) Why is changing the default database listening port important?
Attackers use automated port scanners to look for databases running on known default ports. Changing the default port can slow down an attacker.
Data Encryption
18. a) Why is encryption usually attractive for sensitive data from a legal standpoint?
From a legal standpoint, the loss of encrypted data carries negligible risk of exploitation, so it generally avoids the costly responsibility to report that comes with losing unencrypted data.
b) How long must an encryption key be to be considered strong today?
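Key strength is measured in bits rather than characters. For symmetric encryption, keys of 128 bits or more are generally considered strong today; public key algorithms such as RSA need far longer keys (2,048 bits or more). A minimum of eight characters is a guideline for passwords, not for encryption keys.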
c) What happens if the encryption key is lost?
If the key is lost, this could be disastrous for the firm. People with legitimate access will also be locked out.
d) How do companies address this risk?
Companies address the risk of lost encryption keys by holding copies of the keys in escrow, either on site or off.
e) Why is entrusting users to do key escrow risky?
The user may not do it or may not be able to find the key when it is needed. A user who is fired may refuse to give it up, locking up all the data on the computer.
f) In what sense is encryption usually transparent to the user?
Encryption usually is fully transparent to the PC user. As long as you know the password for your computer, you can work with encrypted directories and files exactly as you do with unencrypted directories and files.
g) Why is this attractive?
It is attractive because once the user is logged in, he or she can see all encrypted data.
h) Why is this dangerous?
Encryption is only as good as the security of accessing a computer; if a computer is poorly secured with inadequate or no passwords, encryption will not matter as the attacker has full access to the PC.
i) What must users do to address this danger?
Users must have strong login passwords.
j) How does encryption make file sharing more difficult?
File sharing requires files to be decrypted before they are shared, so encryption adds another step to the file sharing process.
Data Loss Prevention
19. a) What is Data Loss Prevention (DLP)?
Data loss prevention (DLP) is a set of policies, procedures, and systems designed to prevent sensitive data from being released to unauthorized persons.
b) Are there some types of data that are too risky to collect?
Yes, health care records, credit card information, or social security numbers (SSN) may have additional legal requirements related to proper storage and handling. Loss of these data may lead to enormous liabilities. The potential cost of being sued may outweigh the benefit of having them on file.
c) In your judgment, do most organizations adequately protect their data? Why?
Student answers will vary.
d) What is PII? Please give a couple examples of PII.
Personally identifiable information can be used to uniquely identify a person. Examples include name, SSN, driver license number, tax ID number, address, photo, etc.
e) What is data masking?
Data masking obscures data such that it cannot identify a specific person, but remains practically useful.
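A simple form of data masking can be sketched with a regular expression. The pattern and the choice to keep the last four digits are assumptions for illustration; whether a partially masked value is acceptable depends on the organization's policy.

```python
import re

# Matches SSNs written as 123-45-6789 and captures the last four digits.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-(\d{4})\b")


def mask_ssn(text: str) -> str:
    """Replace the first five digits of each SSN with X characters."""
    return SSN_PATTERN.sub(r"XXX-XX-\1", text)


masked = mask_ssn("Patient SSN: 123-45-6789")
print(masked)
```

The masked value keeps its familiar shape, so it remains useful for testing or display, but it no longer carries the full identifier.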
Web Scraping with Yahoo Pipes
20. a) Could web scraping be a threat to a corporation? Why?
Yes, a competitor could use web scraping to harvest product information, pricing, recommendations, etc. This could result in a lost competitive advantage.
b) What are mashups? Give an example.
Mashups combine content from multiple websites. Student answers will vary. One example would be Google Maps and real estate listings.
c) What is the difference between a spider and a web scraper?
Spiders or crawlers navigate the Web by following hyperlinks from one page to another. Web scrapers, or web data extractors, extract only small parts from webpages, and then aggregate the extracted data from various webpages.
d) Is web scraping ethical, legal, criminal? Why?
There are some unresolved legal issues related to web scraping. Although web scraping may be against the terms of use of some websites, we don’t know if it is illegal, or not. It's also unclear whether or not it is ethical.
Information Triangulation
21. a) How are linking attributes used to connect disparate databases?
Attributes from one database match, or closely match, attributes in a second database. For example, a person’s name from a public voter list could be associated with medical data through linking attributes.
b) Explain information triangulation.
Data from multiple sources can be combined to identify individuals in a form of information triangulation. The combination of two compliant “anonymous” datasets can be used to create a third dataset that is noncompliant, and possibly against the law.
c) What are the odds of correctly identifying a person based on their ZIP code, date of birth, and gender? Why?
Professor Latanya Sweeney at Carnegie Mellon University combined data from a public voter list with seemingly anonymous medical data. She found that it was possible to correctly re-identify 87 percent of individuals using ZIP code, birth date, and sex.
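The linking step can be sketched as a join on the shared attributes. All records below are invented toy data, and the field names are assumptions; the point is only that ZIP code, birth date, and sex together can act as a key across two datasets.

```python
# Hypothetical public voter list (names are present).
voters = [
    {"name": "Alice Example", "zip": "02138", "dob": "1960-07-31", "sex": "F"},
    {"name": "Bob Sample", "zip": "02139", "dob": "1971-01-15", "sex": "M"},
]

# Hypothetical "anonymous" medical data (names removed).
medical = [
    {"zip": "02138", "dob": "1960-07-31", "sex": "F", "diagnosis": "asthma"},
    {"zip": "02140", "dob": "1985-03-02", "sex": "M", "diagnosis": "fracture"},
]

LINKING_ATTRIBUTES = ("zip", "dob", "sex")


def link_key(record):
    """Build a tuple of the linking attributes for one record."""
    return tuple(record[attribute] for attribute in LINKING_ATTRIBUTES)


# Index the voter list by the linking attributes, then look up each medical row.
voter_index = {link_key(voter): voter["name"] for voter in voters}
reidentified = [
    (voter_index[link_key(row)], row["diagnosis"])
    for row in medical
    if link_key(row) in voter_index
]
print(reidentified)
```

Neither dataset alone ties a name to a diagnosis, but the combined result does, which is the third, noncompliant dataset described under information triangulation.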
d) What is profiling?
Profiling uses statistical methods, algorithms, and mathematics to find patterns in a dataset which uniquely identify an individual.
Document Restrictions
22. a) What is DRM? (Do not just spell out the acronym.)
Digital Rights Management restricts what people can do with certain types of data, such as copyrighted material (music, books, pictures, etc.).
b) Why is DRM desirable?
DRM is desirable because it can be used to protect trade secrets, sensitive personal data, and copyrighted material.
c) Give some examples of use restrictions that a company may wish to impose on a document.
A firm may allow a person to download and view a document but not to save it locally, print it, change it, or take other actions.
d) How can many DRM protections against unauthorized printing be circumvented?
If the information appears on the screen, screen capture software can make a copy to be put in another file for saving, printing, and other undesired actions.
e) What is the purpose of data extrusion management?
Data extrusion management attempts to prevent restricted data from leaving the firm without permission.
f) How can DLP systems be effective when placed at the gateway, on clients, and on a database server?
DLP systems can filter all incoming and outgoing content including e-mail, instant messaging, FTP transfers, unapproved web mail, and so on. Client content can be scanned before data are sent. This would prevent illicit content from being passed across the local network. DLP systems can actively search out, tag, and monitor sensitive data anywhere on corporate databases. They can also monitor access to sensitive data.
g) What is watermarking?
Watermarking is adding invisible information to a file that can be used to identify the source.
h) In what two ways can watermarking be used in data extrusion management?
Files can be watermarked for internal use only, and these files can be filtered out if attempts are made via e-mail attachments, FTP, or other means to send them outside the firm.
Also, each copy of a file can be given a different watermark. If a file is extruded to the outside world and then found again, the file can be traced back to its first receiver through the file’s specific watermark.
i) Why is it desirable to prevent a computer from working with removable media?
It is easy to steal data through removable media.
j) Why should restrictions on removable media be enforced technologically?
Non-technological restrictions on the use of removable media are almost impossible to enforce.
k) Why have document protections not been used heavily in organizations?
They are very difficult to enforce, and they usually restrict access to documents and their functionality in ways that users find uncomfortable.
Employee Training
23. a) Why do employees have to be trained about data security?
Employees must be trained on data security policies and procedures because they are often unaware that they are violating those policies. They may also be unaware of how their actions could lead to data loss.
b) Do you know someone who has posted information about work on their blog or social networking site? Was it positive or negative?
Answers will vary. It’s not uncommon to have most students respond that they know someone that has posted negative comments about their employer.
c) From a security point of view, do you think social networking sites have made corporations more, or less secure?
Answers will vary. In general, social networking has improved communication, and connected people in unique ways. However, it has also created unique security problems that did not previously exist. In this sense, it has made it less secure.
Data Destruction
24. a) Why is it important to destroy data on backup media and PCs before discarding them or transferring them to someone else?
If they contain sensitive information, this information may be used subsequently by unauthorized parties.
b) What is the difference between basic file deletion and wiping?
Basic file deletion happens when you empty the Recycle Bin. The pointers referring to certain sectors are removed, but the data in those sectors remain. Only the reference to the sectors has been removed. The file has been logically, but not physically, removed. Wiping, or clearing, is logically and physically erasing data so that it is unrecoverable. Even recovery software cannot restore files if they have been securely deleted. The hard drive remains usable, but prior data cannot be recovered.
c) Is it safe to wipe a hard disk and then give it to someone else? Why, or why not?
Yes, secure file deletion, known as wiping or clearing, is logically and physically erasing data so that it is unrecoverable. Even recovery software cannot restore files if they have been securely deleted. The hard drive remains usable, but prior data cannot be recovered. You could safely give it to someone else without worrying about them recovering your data.
d) What does degaussing do?
It demagnetizes the media.
e) Name some effective methods of data destruction.
Media can be shredded, melted, or degaussed (demagnetized).
f) How can optical disks be destroyed?
The best way to destroy an optical disk is to destroy it physically. It is recommended to run the disk through a shredder.
1. a) What is the difference between data and information?
Data are raw facts. Information is meaning extracted from data.
b) How can data be protected while it is being transmitted?
It can be encrypted. (e.g., using a cryptographic system).
c) How can data be protected while it is being processed?
Data can be protected by making sure applications are securely coded and hosts are hardened.
d) What are some ways that data can be attacked when it is stored?
It can be accessed by unauthorized persons, destroyed, copied without permission, and taken outside the organization (i.e., data loss).
e) How can data be protected while it is being stored?
It can be properly backed-up, encrypted, and when necessary, destroyed. Restrictions on access to the data can also be implemented while it is being stored.
Data Protection: Backup
The Importance of Backup
2. a) List the ways in which data can be lost, adding some of your own if you can.
Data can be lost by mechanical failure, environmental casualties, malware, lost or stolen devices, and human error.
b) How does backup ensure availability?
Backup will ensure availability because you will be able to still access your files from backup copies, even if your primary hard disk fails.
c) Have you ever had to use a backup to restore a file? Explain.
Student responses will vary.
Scope of Backup
3. a) Distinguish between file/directory data backup and image backup.
File/directory data backup copies data (not programs, registry settings, configurations). Image backup copies data and all those other things.
b) Why is file/directory backup attractive compared with image backup?
File/directory backup is more attractive compared to image backup because it takes up less storage space and is much faster.
c) Why is image backup attractive compared with file/directory data backup?
Image backup is attractive because it requires minimal additional work to restore a functioning, fully capable PC.
d) What is shadowing?
Shadowing frequently records a backup copy of each file actively being worked. If there is a failure, little will be lost.
e) What is the advantage of shadowing over file/directory data backup?
The advantage of shadowing is that it allows for more current file changes to be restored.
f) How is shadowing limited?
Shadowing is limited because when the capacity of the storage is exceeded, the oldest files are deleted first.
Full versus Incremental Backups
4. a) Why don’t most companies do full backup every night?
Full backups take a long time and thus companies usually only conduct full backups weekly.
b) What is incremental backup (be precise)?
Incremental backups back up only the data that has changed since the most recent backup of any kind, whether full or incremental.
c) A company does a full backup one night. Call this backup Cardiff. On three successive nights, it does incremental backups, which it labels Greenwich, Dublin, and Paris. In restoration, what backups must be restored first and second?
Cardiff, then Greenwich. (Dublin and Paris come next.)
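The restoration order can be sketched in a few lines of Python. This is a minimal illustration, not a real backup tool: the function name `restore_order` and the (label, kind) tuples are assumptions made for the example.

```python
def restore_order(backups):
    """Return backup labels in the order they must be restored.

    `backups` is a chronological list of (label, kind) tuples, where kind
    is "full" or "incremental". Restoration starts from the most recent
    full backup and then replays every later incremental in order.
    Raises ValueError if the list contains no full backup.
    """
    last_full = max(i for i, (_, kind) in enumerate(backups) if kind == "full")
    return [label for label, _ in backups[last_full:]]


nightly = [
    ("Cardiff", "full"),
    ("Greenwich", "incremental"),
    ("Dublin", "incremental"),
    ("Paris", "incremental"),
]
print(restore_order(nightly))
```

Each incremental holds only the changes made since the previous night, so skipping or reordering one would leave the restored data inconsistent.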
Backup Technologies
5. a) What are the advantages of centralized backup compared with local backup?
Centralized backup alleviates the key problems associated with local backup, which are:
Limited ability to enforce backup policy
Limited ability to audit which computers were backed up per policy, how backups were done, or how data was protected.
b) What is CDP (do not just spell out the acronym)?
CDP is continuous data protection. Two sites back up each other in real time.
c) Why is CDP attractive?
CDP is attractive because other sites can take over very quickly in case of a disaster, with little data loss.
d) Why is it expensive?
CDP is expensive because ample bandwidth is needed between CDP sites to allow the real-time backup of data.
e) Why is backup over the Internet to a backup storage provider attractive for client PC users?
It is attractive mainly because it is very convenient.
f) What security risk does it create?
There is the concern that the company owning the PC loses control over its data, which is a very large security risk.
g) What is mesh backup?
Mesh backup is peer-to-peer backup onto other client computers. It sends backup data in parcels to many other client PCs.
h) What are its technical challenges?
First, the mesh backup operation must not slow down the computer on which parcels are being written, or from which parcels are being retrieved. Second, specific client PCs are not always available for parcel retrieval, so parcels need to be sent out redundantly. The most difficult technical problem is security. When a client PC receives a backup parcel, its user must not be able to read, modify, or delete it.
i) Why is mesh backup desirable?
Mesh backup is desirable because it could make client PC backup automatic and, thus, eliminate the human factor in failing to conduct regular backups. It also utilizes corporate PC power that is often underused, compared to expensive separate backup hardware.
Backup Media and RAID
6. a) Why is magnetic tape desirable as a backup medium?
Magnetic tape can store vast amounts of data at the lowest cost per bit of any backup medium.
b) Why is tape not desirable?
Tape is not desirable because it is painfully slow and because there are many different tape formats and readers (unlike optical media, there is little standardization).
c) Why is backup onto another hard drive attractive?
This method is attractive because it is a very fast method of backup.
d) Why is it not a complete backup solution?
This is not a complete solution because the backup drive could also be lost if the computer is stolen or damaged in a fire. This method is also too expensive for long-term storage.
e) How can this limitation be addressed?
Many companies use a hybrid backup method, using additional hard drives for storage for as long as possible then transferring to tape at a pre-determined time or data size.
f) How much data can be stored on a dual-layer DVD?
About 8.5 GB.
g) What is the advantage of burning backup data onto optical disks?
The advantage is that almost all users have optical disk burners.
h) Is storing backups on optical disks for several years likely to be safe?
Probably not, because the life of optical disks is still unknown and probably is short.
Disk Arrays—RAID
7. a) How can disk arrays ensure data reliability and availability?
A system using an array of drives increases reliability because redundant data are stored on multiple disks. Failure of a single drive in the array would not precipitate data loss. An array of drives can also increase read-write performance. Disk performance is increased because data can be written to, or read from, multiple disks simultaneously.
b) Explain RAID 0.
A RAID 0 configuration increases data transfer speeds and capacity by writing simultaneously to multiple hard disks. Writing data across multiple disks is known as striping. The striped set of disks is fast, but offers no reliability. If one of the drives fails, data on all disks are lost.
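Striping can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: each "disk" is a Python list of chunks, and the names `stripe` and `read_back` and the 4-byte chunk size are hypothetical choices for the example.

```python
def stripe(data: bytes, num_disks: int, chunk_size: int = 4):
    """Distribute data across disks round-robin in fixed-size chunks."""
    disks = [[] for _ in range(num_disks)]
    for index in range(0, len(data), chunk_size):
        chunk = data[index:index + chunk_size]
        disks[(index // chunk_size) % num_disks].append(chunk)
    return disks


def read_back(disks):
    """Reassemble the original data; a failed disk is represented by None."""
    if any(disk is None for disk in disks):
        raise OSError("RAID 0 has no redundancy: one lost disk loses the set")
    longest = max(len(disk) for disk in disks)
    chunks = []
    for row in range(longest):
        for disk in disks:
            if row < len(disk):
                chunks.append(disk[row])
    return b"".join(chunks)


array = stripe(b"ABCDEFGHIJ", num_disks=2)
print(array)             # chunks alternate between the two disks
print(read_back(array))  # the original bytes, reassembled
```

Because consecutive chunks sit on different disks, a real array can transfer them in parallel. The same layout explains the weakness: every file has pieces on every disk, so losing one disk damages everything.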
c) Explain RAID 1.
In a RAID 1 configuration, the client operating system writes data to both the primary hard drive and the backup hard drive at the same time. No striping is used, so data transfer speeds remain approximately the same. Storage capacity also remains the same because the additional drive is just a mirror of the primary drive.
d) Explain RAID 5.
A RAID 5 configuration stripes data across multiple disks to increase data transfer speeds. Reliability is provided by parity bits that enable reconstruction of data stored on other disks. A RAID 5 configuration can recover from a single drive failure, but not a multi-drive failure.
Computing Parity
8. a) What is parity?
Parity bits enable reconstruction of data stored on other disks in case of failure. Parity bits are stored on disks such that they can be used to reconstruct the original parts of any lost disk in the array.
b) How does the XOR operator work?
XOR compares two bits. The parity bit is 1 if exactly one of the two bits is 1, and 0 if the two bits are the same (both 1 or both 0).
c) How can parity be used to restore lost data?
Suppose Disk 3 (of 3) experienced complete failure. Data from Disk 1 (Part 1, Part 3, and Parity 5&6), together with data from Disk 2 (Part 2, Part 5, and Parity 3&4), could be used to recalculate the lost data on Disk 3 (Part 4, Part 6, and Parity 1&2). No data would be lost. After all calculations are done, the data on the new Disk 3 will be identical to the data before the failure.
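The recalculation relies on a property of XOR: if parity = A XOR B, then B = A XOR parity. The following is a minimal sketch for one stripe of a three-disk array. The block contents and the helper name `xor_blocks` are made up for illustration.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must be the same length")
    return bytes(x ^ y for x, y in zip(a, b))


# One stripe: two data blocks on two disks, their parity on a third disk.
part_1 = b"\x0f\xa5\x33"
part_2 = b"\xf0\x5a\x11"
parity = xor_blocks(part_1, part_2)

# If the disk holding part_2 fails, XOR the two surviving blocks to rebuild it.
rebuilt_part_2 = xor_blocks(part_1, parity)
print(rebuilt_part_2 == part_2)
```

The same calculation works whichever of the three blocks is lost, which is why the array survives any single drive failure but not two at once.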
d) How long would it take to recalculate the data on a lost disk?
It depends on the number of disks, the size of the disks, read/write speeds, and so on. It could take anywhere from several hours to several days. Rebuild times vary widely.
9. a) What are the advantages of RAID 5 over RAID 1?
A small amount of storage capacity is lost by storing the parity bits (RAID 5), but it is much less than it would be if the entire array were mirrored (RAID 1). The recovery/rebuild time on RAID 5 would be much longer than on RAID 1. No recalculation of the lost data is necessary in RAID 1.
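The capacity trade-off can be worked out with simple arithmetic. This sketch assumes identical disks and the textbook layouts (RAID 1 as a single mirrored set, RAID 5 giving up one disk's worth of space to parity). The function name `usable_capacity_tb` is hypothetical.

```python
def usable_capacity_tb(level: int, disks: int, disk_tb: float) -> float:
    """Usable capacity of an array of identical disks, in terabytes."""
    if level == 0:
        return disks * disk_tb            # striping only, nothing reserved
    if level == 1:
        if disks < 2:
            raise ValueError("RAID 1 needs at least two disks")
        return disk_tb                    # every extra disk is a mirror
    if level == 5:
        if disks < 3:
            raise ValueError("RAID 5 needs at least three disks")
        return (disks - 1) * disk_tb      # one disk's worth holds parity
    raise ValueError("only RAID 0, 1, and 5 are modeled here")


raid1 = usable_capacity_tb(1, disks=2, disk_tb=2.0)   # 2.0 of 4.0 TB usable
raid5 = usable_capacity_tb(5, disks=4, disk_tb=2.0)   # 6.0 of 8.0 TB usable
print(raid1 / 4.0, raid5 / 8.0)                       # usable fraction of raw space
```

With these numbers, mirroring leaves half of the raw space usable, while a four-disk RAID 5 array leaves three quarters usable.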
b) Which RAID level discussed in this chapter has the fastest read-write speeds?
RAID 0
c) Is RAID 5 appropriate for home users? Why, or why not?
The typical consumer is fairly price conscious and not overly worried about data loss. Consumers prefer not to think about their brand-new computer failing, and they really don't want to pay for two additional hard disks. RAID 5 would be great for home users, but it is likely too expensive and difficult to configure. The tech-savvy end user could have a RAID 5 configuration at home, but most end users won't.
Data Storage Policies
10. a) What should backup creation policies specify?
Backup creation policies should specify what data should be backed up, frequency of backups, restoration test periodicity, and other related guidance.
Policies should address different types of data and computers to ensure the right backup is provided for the resource.
b) Why are restoration tests needed?
Restoration tests are needed because if the data is important enough to spend precious time and resources to backup, it needs to be available when called upon. Not testing will almost guarantee some failure that could have been addressed with simple testing.
c) Where should backup media be stored for the long term?
Backup media should be stored at another site.
d) What should be done about backup media until they are moved?
Backup media should be stored in a fireproof and waterproof safe until they are moved.
e) Why is the encryption of backup media critical?
Encryption of backup media is critical because backup data can be lost or stolen; encrypting the data protects the company from expensive losses of PII or trade secrets.
f) What three dangers require control over access to backup material?
The dangers of loss, theft, and damage to backup material require control over access to the data.
g) If Person A wishes to check out backup media, who should approve this?
The manager of the person requesting media should approve the checkout.
h) Why are checkouts of backup media suspicious?
Checkouts of backup media are rare, so every checkout should be examined carefully. There must be a valid operational reason for retrieving the backup media.
i) Why should business units and the legal department be involved in creating retention policies?
There are many business and legal requirements on the retention of certain types of data; these departments should be involved in creating retention policies.
j) What should backup audits include?
Backups, like all other processes, require audits to make sure the established policies are being followed. Backup audits should examine backups for compliance with policy, including tracing what happened to samples of data that should have been backed up.
E-Mail Retention
11. a) Why is retaining e-mail for a long period of time useful?
Retaining e-mail is useful in that it provides a history to be searched.
b) Why is it dangerous?
It is dangerous because if it can be searched, information in the e-mail could be used against you.
c) What is legal discovery?
Legal discovery is the process where a firm must provide records related to a lawsuit, including e-mails.
d) What are courts likely to do if it would be very expensive for a firm to discover all of its e-mail pertinent to a case?
Courts do not care; the firm will have to pay to get the e-mails recovered.
e) What can happen if a firm fails to retain required e-mail?
A firm can be fined or lose a lawsuit if it fails to retain required e-mail.
f) What is accidental retention?
Accidental retention is when e-mail or other files are located on backup tapes when they were thought to be deleted.
g) How long can third-party e-mail providers keep your e-mails?
Indefinitely.
h) Is there a specific law that specifies what information must be retained for legal purposes?
No, there are multiple laws that specify what information must be retained.
i) What two requirements in the U.S. Rules of Civil Procedure are likely to cause problems for firms who do not have a good archiving process?
In the initial discovery meeting, which occurs shortly after a lawsuit begins, the defendant must be able to describe what information it has and how it will provide it. This requires a good in-place archiving system.
The firm must be able to put a hold on all destruction of potentially relevant information if the possibility of a lawsuit is foreseen. It needs a good in-place archiving system to be able to do this.
j) Why is message authentication important in an archiving system?
Without this, there is no way to prove who sent a particular e-mail.
k) Comment on a corporate policy of deleting all e-mail after 30 days.
Deleting e-mail after 30 days will inevitably expose the firm to serious trouble, including fines and the possibility of losing lawsuits.
User Training
12. a) Are e-mail messages sent by employees private?
No. E-mail sent over corporate resources should not be considered private.
b) What should employees be trained not to put in e-mail messages?
Employees should be trained not to say anything in e-mail that they would not want to see in court.
Spreadsheets
13. a) Why is spreadsheet security an IT security concern?
Spreadsheets are a major focus of new compliance regulations resulting from the Sarbanes–Oxley Act of 2002. Spreadsheets are highly concentrated with PII and are used for many proprietary financial calculations that would be disastrous to have hacked or deleted. Also, manipulation of spreadsheets is a key technique of criminals attempting fraud.
b) What two protections should be applied to spreadsheets?
The two protections are extensive testing for errors and fraud indicators and the use of spreadsheet vault servers.
c) Briefly list the functions of a vault server.
Spreadsheet vault servers offer the following:
• provide strong access control, authentication, authorizations, and auditing
• limit access to what a particular user can do and see in a spreadsheet
• extensive auditing
• version control of files
• cryptographic protection between PC and server
• active detection tools looking for policy violations
• strong management tools to state and update policies
d) Comment on vault server authorizations.
Vault server authorizations go beyond what a person can do with a file. They also limit what a user can see within a spreadsheet (for example, the user can see values in the cells but not the formulas that created them).
e) Describe vault server auditing.
Vault server auditing includes check in/check out of files and individual cell changes. Individual cell change logs can be used for forensic analysis when required.
Database Security
14. a) What is a relational database? Explain.
Relational databases store their data in relations, commonly referred to as tables. Entities are types of objects that represent persons, places, things, or events. Entities are things (nouns) you want to store information about. Attributes—columns in the table—are characteristics (adjectives) about the entity that you want to collect.
b) Why would a database administrator want to restrict access to certain tables?
For example, an accountant at a hospital would not need to see tables containing patient medical records. Even though the accountant may be an honest and hard-working employee, his job does not require access to tables containing patient medical records. Conversely, a medical doctor would not need access to tables containing financial data.
c) Why would a database administrator want to restrict access to certain columns?
For example, only a few employees might be allowed to retrieve information in the salary column. Other employees could read other attributes in the Employee table, but not salaries.
d) Why would a database administrator want to restrict access to certain rows?
For example, access to employee data can be restricted to rows for each department. Each department could view their own employee records, but not the records for other departments.
e) How would limiting data granularity protect the underlying database?
For instance, in analyzing data for personnel, privacy concerns may restrict searches to being no more detailed than sums and averages at the department level.
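The column, row, and granularity restrictions in the answers above can be sketched with database views. This example uses Python's built-in `sqlite3` module. The table, view names, and sample rows are invented for illustration. SQLite has no user accounts, so the sketch only shows what each restricted view exposes. A production DBMS would also grant each user permission on the view rather than on the underlying table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (
        id INTEGER PRIMARY KEY,
        name TEXT,
        department TEXT,
        salary INTEGER
    );
    INSERT INTO employee (name, department, salary) VALUES
        ('Ana', 'Sales', 52000),
        ('Ben', 'Sales', 61000),
        ('Chi', 'IT', 70000);

    -- Column restriction: everything except the salary column.
    CREATE VIEW employee_public AS
        SELECT id, name, department FROM employee;

    -- Row restriction: only the Sales department's records.
    CREATE VIEW employee_sales AS
        SELECT * FROM employee WHERE department = 'Sales';

    -- Granularity restriction: department-level aggregates only.
    CREATE VIEW department_summary AS
        SELECT department, COUNT(*) AS headcount, AVG(salary) AS avg_salary
        FROM employee
        GROUP BY department;
""")

public_columns = [col[0] for col in
                  conn.execute("SELECT * FROM employee_public").description]
sales_names = [row[0] for row in
               conn.execute("SELECT name FROM employee_sales ORDER BY name")]
summary = conn.execute(
    "SELECT department, headcount, avg_salary "
    "FROM department_summary ORDER BY department").fetchall()

print(public_columns)
print(sales_names)
print(summary)
```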
f) What is a data model?
A data model consists of entity names, attributes, and the structure of relationships between entities.
Database Access Control
15. a) What is a DBMS?
A database management system (DBMS), such as Microsoft SQL Server, MySQL, IBM DB2, or Oracle, can manage database structures and restrict access to individual databases.
b) Can a DBMS manage multiple databases? Why?
Yes, a DBMS (like Microsoft Access) can access many different databases. The DBMS simply manages access to the database.
c) How can validation protect against a SQL injection attack?
Incoming data can be validated by making sure they are in the expected data type (e.g., text, integer, or binary), size (e.g., 32 bits, 10 characters, or less than 5 KB), or format (e.g., DD/MM/YY or (555)555-5555).
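Type, size, and format checks of this kind might look like the following. This is a minimal sketch. The field names (customer ID, phone, date) and the specific limits are assumptions chosen to mirror the examples in the answer.

```python
import re
from datetime import datetime


def is_valid_customer_id(value: str) -> bool:
    """Type and size check: between 1 and 10 decimal digits, nothing else."""
    return re.fullmatch(r"[0-9]{1,10}", value) is not None


def is_valid_phone(value: str) -> bool:
    """Format check for the (555)555-5555 pattern."""
    return re.fullmatch(r"\([0-9]{3}\)[0-9]{3}-[0-9]{4}", value) is not None


def is_valid_date(value: str) -> bool:
    """Format check for DD/MM/YY that also rejects impossible dates."""
    try:
        datetime.strptime(value, "%d/%m/%y")
    except ValueError:
        return False
    return True


print(is_valid_customer_id("12345"))      # plain digits pass
print(is_valid_customer_id("1 OR 1=1"))   # an injection attempt fails
```

Input that fails validation is rejected before it ever reaches the SQL statement.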
d) How can sanitation protect against a SQL injection attack?
Incoming data can be sanitized to remove unacceptable characters that could be used to manipulate the SQL statement.
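A simple form of sanitation is an allow-list that drops every character outside an approved set. The sketch below assumes a name field limited to letters, digits, spaces, and hyphens. The function name `sanitize_name` and the sample data are hypothetical. The final query also passes the value as a bound parameter, a complementary safeguard that the answer above does not mention.

```python
import re
import sqlite3


def sanitize_name(value: str) -> str:
    """Keep letters, digits, spaces, and hyphens; drop everything else."""
    return re.sub(r"[^A-Za-z0-9 \-]", "", value)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (name TEXT)")
conn.execute("INSERT INTO customer VALUES ('Smith')")

hostile = "Smith' OR '1'='1"
cleaned = sanitize_name(hostile)
print(cleaned)  # the quotes and equals sign are gone

# The cleaned value is bound as a parameter, never spliced into the SQL text.
rows = conn.execute(
    "SELECT name FROM customer WHERE name = ?", (cleaned,)).fetchall()
print(rows)
```

An allow-list this strict also removes legitimate characters, such as the apostrophe in O'Brien, so the permitted set has to be chosen per field.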
Database Auditing
16. a) What types of database events should be audited?
Logins, changes, warnings, exceptions, and special access should be audited.
b) How could SQL triggers be used to secure a database?
Triggers can be used to implement audit policies and to detect noncompliance with security policies.
c) What is a DDL trigger?
Data Definition Language (DDL) triggers can be used to produce automatic responses if the structure of the database has been altered.
d) What is a DML trigger?
Data Manipulation Language (DML) triggers can be used to produce automatic responses if data have been altered.
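A DML trigger used for auditing can be sketched with Python's built-in `sqlite3` module. The table names, trigger name, and sample data are invented for the example. SQLite supports DML triggers only, so a DDL trigger cannot be shown this way, and the trigger syntax of other DBMSs differs in detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER);
    CREATE TABLE salary_audit (
        employee_id INTEGER,
        old_salary INTEGER,
        new_salary INTEGER,
        changed_at TEXT DEFAULT CURRENT_TIMESTAMP
    );

    -- DML trigger: fires automatically whenever a salary value is updated.
    CREATE TRIGGER log_salary_change
    AFTER UPDATE OF salary ON employee
    BEGIN
        INSERT INTO salary_audit (employee_id, old_salary, new_salary)
        VALUES (OLD.id, OLD.salary, NEW.salary);
    END;

    INSERT INTO employee (name, salary) VALUES ('Ana', 52000);
    UPDATE employee SET salary = 58000 WHERE name = 'Ana';
""")

audit_rows = conn.execute(
    "SELECT employee_id, old_salary, new_salary FROM salary_audit").fetchall()
print(audit_rows)
```

The application that issued the UPDATE did nothing to create the audit row. The database recorded the change on its own, which is what makes triggers useful for enforcing audit policy.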
e) What type of sensitive data exists at your organization?
Answers will vary.
Database Placement and Configuration
17. a) What is a multi-tiered architecture? Why is it important?
A multi-tiered architecture separates the presentation (webserver), application processing (middleware server), and database management (database server) functions. It protects other layers by separating functions. If one layer is compromised, it won’t necessarily compromise the other layers.
b) How could a multi-tiered architecture stop or mitigate the effects of an attack?
A tiered architecture provides a greater level of protection to the database because vulnerabilities or attacks on one layer won’t necessarily affect other layers. For example, a DoS attack against the webserver won’t overwhelm the database server and shut down the database.
c) Why is changing the default database listening port important?
Attackers use automated port scanners to look for databases running on known default ports. Changing the default port can slow down an attacker.
Data Encryption
18. a) Why is encryption usually attractive for sensitive data from a legal standpoint?
From a legal standpoint, the loss of encrypted data carries negligible risk of exploitation and usually avoids the costly obligation to report the loss that applies to unencrypted data.
b) How long must an encryption key be to be considered strong today?
An encryption key must be at least eight characters long to be considered strong today.
c) What happens if the encryption key is lost?
If the key is lost, this could be disastrous for the firm. People with legitimate access will also be locked out.
d) How do companies address this risk?
Companies address the risk of lost encryption keys by holding copies of the keys in escrow, either on site or off.
e) Why is entrusting users to do key escrow risky?
The user may not do it, may not be able to find the key later, or, if fired, may refuse to give it up, locking up all the data on the computer.
f) In what sense is encryption usually transparent to the user?
Encryption usually is fully transparent to the PC user. As long as you know the password for your computer, you can work with encrypted directories and files exactly as you do with unencrypted directories and files.
g) Why is this attractive?
It is attractive because once the user is logged in, he or she can see all encrypted data.
h) Why is this dangerous?
Encryption is only as good as the security of accessing a computer; if a computer is poorly secured with inadequate or no passwords, encryption will not matter as the attacker has full access to the PC.
i) What must users do to address this danger?
Users must have strong login passwords.
j) How does encryption make file sharing more difficult?
File sharing requires files to be unencrypted prior to sharing; adding encryption adds another step to the file sharing processes.
Data Loss Prevention
19. a) What is Data Loss Prevention (DLP)?
Data loss prevention (DLP) is a set of policies, procedures, and systems designed to prevent sensitive data from being released to unauthorized persons.
b) Are there some types of data that are too risky to collect?
Yes, health care records, credit card information, or social security numbers (SSN) may have additional legal requirements related to proper storage and handling. Loss of these data may lead to enormous liabilities. The potential cost of being sued may outweigh the benefit of having them on file.
c) In your judgment, do most organizations adequately protect their data? Why?
Student answers will vary.
d) What is PII? Please give a couple examples of PII.
Personally identifiable information can be used to uniquely identify a person. Examples include name, SSN, driver license number, tax ID number, address, photo, etc.
e) What is data masking?
Data masking obscures data such that it cannot identify a specific person, but remains practically useful.
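Two common masking approaches are partial masking and generalization. The sketch below is illustrative only: the function names, the NNN-NN-NNNN and YYYY-MM-DD input formats, and the sample values are assumptions.

```python
import re


def mask_ssn(ssn: str) -> str:
    """Hide all but the last four digits of a NNN-NN-NNNN value."""
    if re.fullmatch(r"[0-9]{3}-[0-9]{2}-[0-9]{4}", ssn) is None:
        raise ValueError("expected the form NNN-NN-NNNN")
    return "XXX-XX-" + ssn[-4:]


def generalize_birth_date(iso_date: str) -> str:
    """Reduce a YYYY-MM-DD birth date to the year alone."""
    if re.fullmatch(r"[0-9]{4}-[0-9]{2}-[0-9]{2}", iso_date) is None:
        raise ValueError("expected the form YYYY-MM-DD")
    return iso_date[:4]


print(mask_ssn("123-45-6789"))
print(generalize_birth_date("1984-07-21"))
```

The masked values still support tasks such as confirming the last four digits with a caller or grouping records by birth year, but they no longer carry the full identifier.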
Web Scraping with Yahoo Pipes
20. a) Could web scraping be a threat to a corporation? Why?
Yes, a competitor could use web scraping to harvest product information, pricing, recommendations, etc. This could result in a lost competitive advantage.
b) What are mashups? Give an example.
Mashups combine content from multiple websites. Student answers will vary. One example would be Google Maps and real estate listings.
c) What is the difference between a spider and a web scraper?
Spiders or crawlers navigate the Web by following hyperlinks from one page to another. Web scrapers, or web data extractors, extract only small parts from webpages, and then aggregate the extracted data from various webpages.
d) Is web scraping ethical, legal, criminal? Why?
There are some unresolved legal issues related to web scraping. Although web scraping may be against the terms of use of some websites, we don’t know if it is illegal, or not. It's also unclear whether or not it is ethical.
Information Triangulation
21. a) How are linking attributes used to connect disparate databases?
Attributes from one database match, or closely match, attributes in a second database. For example, a person's name from a public voter list could be associated with medical data through linking attributes.
b) Explain information triangulation.
Data from multiple sources can be combined to identify individuals in a form of information triangulation. The combination of two compliant “anonymous” datasets can be used to create a third dataset that is noncompliant, and possibly against the law.
c) What are the odds of correctly identifying a person based on their ZIP code, date of birth, and gender? Why?
Professor Latanya Sweeney at Carnegie Mellon University combined data from a public voter list with seemingly anonymous medical data. She found that it was possible to correctly re-identify 87 percent of individuals using ZIP code, birth date, and sex.
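The linking step can be sketched as a join on ZIP code, birth date, and sex. All records below are fabricated for illustration, and the function name `link` is hypothetical. Only unique matches are kept, since a combination shared by several people does not identify anyone.

```python
LINKING_ATTRIBUTES = ("zip", "birth_date", "sex")

# Fabricated sample data: a named public list and an "anonymous" dataset.
voters = [
    {"name": "Pat Example", "zip": "02138", "birth_date": "1960-05-14", "sex": "F"},
    {"name": "Sam Sample", "zip": "02139", "birth_date": "1975-11-02", "sex": "M"},
]
medical = [
    {"zip": "02138", "birth_date": "1960-05-14", "sex": "F", "diagnosis": "asthma"},
    {"zip": "02140", "birth_date": "1988-01-30", "sex": "M", "diagnosis": "fracture"},
]


def link(named_records, anonymous_records, keys=LINKING_ATTRIBUTES):
    """Attach names to anonymous records whose linking attributes match uniquely."""
    index = {}
    for record in named_records:
        index.setdefault(tuple(record[k] for k in keys), []).append(record)

    linked = []
    for record in anonymous_records:
        matches = index.get(tuple(record[k] for k in keys), [])
        if len(matches) == 1:  # exactly one person fits: re-identified
            linked.append({**matches[0], **record})
    return linked


reidentified = link(voters, medical)
print(reidentified)
```

Neither dataset alone ties a name to a diagnosis, but the combined output does.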
d) What is profiling?
Profiling uses statistical methods, algorithms, and mathematics to find patterns in a dataset which uniquely identify an individual.
Document Restrictions
22. a) What is DRM? (Do not just spell out the acronym.)
Digital Rights Management restricts what people can do with certain types of data, such as copyrighted material (music, books, pictures, etc.).
b) Why is DRM desirable?
DRM is desirable because it can be used to protect trade secrets, sensitive personal data, and copyrighted material.
c) Give some examples of use restrictions that a company may wish to impose on a document.
A firm may allow a person to download a document but not to save it locally, print it, change it, or take other actions.
d) How can many DRM protections against unauthorized printing be circumvented?
If the information appears on the screen, screen capture software can make a copy to be put in another file for saving, printing, and other undesired actions.
e) What is the purpose of data extrusion management?
Data extrusion management attempts to prevent restricted data from leaving the firm without permission.
f) How can DLP systems be effective when placed at the gateway, on clients, and on a database server?
DLP systems can filter all incoming and outgoing content including e-mail, instant messaging, FTP transfers, unapproved web mail, and so on. Client content can be scanned before data are sent. This would prevent illicit content from being passed across the local network. DLP systems can actively search out, tag, and monitor sensitive data anywhere on corporate databases. They can also monitor access to sensitive data.
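Content filtering at a gateway or on a client often comes down to pattern matching on outgoing text. The following is a deliberately simple sketch. The two regular expressions, the category labels, and the function names are assumptions, and commercial DLP products use far richer detection than this.

```python
import re

# Hypothetical detection rules: U.S. SSN format and 16-digit card numbers.
SSN_PATTERN = re.compile(r"\b[0-9]{3}-[0-9]{2}-[0-9]{4}\b")
CARD_PATTERN = re.compile(r"\b(?:[0-9]{4}[ -]?){3}[0-9]{4}\b")


def find_sensitive(text: str):
    """Return the categories of sensitive data detected in the text."""
    findings = []
    if SSN_PATTERN.search(text):
        findings.append("ssn")
    if CARD_PATTERN.search(text):
        findings.append("card_number")
    return findings


def allow_outgoing(text: str) -> bool:
    """Permit a message only if no sensitive pattern was found."""
    return not find_sensitive(text)


print(allow_outgoing("Invoice attached, see you Monday."))
print(find_sensitive("Her SSN is 123-45-6789."))
```

The same check could run at the gateway on outbound e-mail, on the client before a file is sent, or as a scan over database fields to tag sensitive columns.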
g) What is watermarking?
Watermarking is adding invisible information to a file that can be used to identify the source.
h) In what two ways can watermarking be used in data extrusion management?
Files can be watermarked for internal use only, and these files can be filtered out if attempts are made via e-mail attachments, FTP, or other means to send them outside the firm.
Also, each copy of a file can be given a different watermark. If a file is extruded to the outside world and then found again, the file can be traced back to its first receiver through the file’s specific watermark.
i) Why is it desirable to prevent a computer from working with removable media?
It is easy to steal data through removable media.
j) Why should restrictions on removable media be enforced technologically?
Non-technological restrictions on the use of removable media are almost impossible to enforce.
k) Why have document protections not been used heavily in organizations?
They are very difficult to enforce and companies usually restrict access and functionality from documents in uncomfortable ways.
Employee Training
23. a) Why do employees have to be trained about data security?
Employees must be trained on data security policies and procedures because they are often unaware that they are violating those policies. They may also be unaware of how their actions could lead to data loss.
b) Do you know someone who has posted information about work on their blog or social networking site? Was it positive or negative?
Answers will vary. It’s not uncommon to have most students respond that they know someone that has posted negative comments about their employer.
c) From a security point of view, do you think social networking sites have made corporations more, or less secure?
Answers will vary. In general, social networking has improved communication, and connected people in unique ways. However, it has also created unique security problems that did not previously exist. In this sense, it has made it less secure.
Data Destruction
24. a) Why is it important to destroy data on backup media and PCs before discarding them or transferring them to someone else?
If they contain sensitive information, this information may be used subsequently by unauthorized parties.
b) What is the difference between basic file deletion and wiping?
Basic file deletion happens when you empty the Recycle Bin. The pointers referring to certain sectors are removed, but the data in those sectors remain. Only the reference to the sectors has been removed. The file has been logically, but not physically, removed. Wiping, or clearing, is logically and physically erasing data so that it is unrecoverable. Even recovery software cannot restore files if they have been securely deleted. The hard drive remains usable, but prior data cannot be recovered.
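The difference can be sketched for a single file: ordinary deletion only removes the directory entry, while wiping overwrites the contents first. This is a minimal illustration, and the function name `overwrite_and_delete` is hypothetical. On SSDs and on journaling or copy-on-write file systems, an in-place overwrite may not reach the sectors that originally held the data, so whole-drive wiping tools are the safer choice for real disposal.

```python
import os
import tempfile


def overwrite_and_delete(path: str, passes: int = 1) -> None:
    """Overwrite a file's contents with random bytes, then delete it."""
    size = os.path.getsize(path)
    with open(path, "r+b") as handle:
        for _ in range(passes):
            handle.seek(0)
            remaining = size
            while remaining > 0:
                chunk = min(remaining, 1024 * 1024)
                handle.write(os.urandom(chunk))
                remaining -= chunk
            handle.flush()
            os.fsync(handle.fileno())  # push the overwrite out to the device
    os.remove(path)                    # only now remove the directory entry


# Demonstration on a throwaway temporary file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"confidential draft")
    demo_path = tmp.name

overwrite_and_delete(demo_path)
print(os.path.exists(demo_path))
```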
c) Is it safe to wipe a hard disk and then give it to someone else? Why, or why not?
Yes, secure file deletion, known as wiping or clearing, is logically and physically erasing data so that it is unrecoverable. Even recovery software cannot restore files if they have been securely deleted. The hard drive remains usable, but prior data cannot be recovered. You could safely give it to someone else without worrying about them recovering your data.
d) What does degaussing do?
It demagnetizes the media.
e) Name some effective methods of data destruction.
Media can be shredded, melted, or degaussed (demagnetized).
f) How can optical disks be destroyed?
The best way to destroy an optical disk is to destroy it physically. It is recommended to run a disk through a shredder.